eye contact
Theatre Review: "An Ark" and "Data"
Two plays soaked in technological anxiety. "An Ark" resembles a webinar with a staring contest, one that no human can win. Before you enter "An Ark," a "mixed reality" performance at the Shed, you check your coat and, more oddly, your shoes. Inside, there are three concentric circles of chairs arranged on a red carpet and, overhead, a white globe resembling a hot-air balloon. A docent explained that, through my virtual-reality headset, I would see four more chairs--and, ideally, they shouldn't float.
A Robot-Assisted Approach to Small Talk Training for Adults with ASD
Ramnauth, Rebecca, Brščić, Dražen, Scassellati, Brian
From dating to job interviews, making new friends or simply chatting with the cashier at checkout, engaging in small talk is a vital, everyday social skill. For adults with Autism Spectrum Disorder (ASD), small talk can be particularly challenging, yet it is essential for social integration, building relationships, and accessing professional opportunities. In this study, we present our development and evaluation of an in-home autonomous robot system that allows users to practice small talk. Results from the week-long study show that adults with ASD enjoyed the training, made notable progress in initiating conversations and improving eye contact, and viewed the system as a valuable tool for enhancing their conversational skills. Imagine a scene where three coworkers are engaging in small talk at the beginning of their workday. One of them is Alex, who has Autism Spectrum Disorder (ASD), a neurodevelopmental condition that often makes it challenging to understand and interpret social cues [2]. Not bad... just trying to power through this Monday. How about you, Alex? A: Monday is okay. Anything exciting happen this weekend? B: Yeah, I finally tried that new restaurant. It was fantas-- A: I watched a movie. C: Glad you liked the restaurant, Ben. It's my kids' favorite spot these days... What movie did you watch, Alex? A: "The Martian." At first glance, this brief example of a typical interaction appears unremarkable. It represents the everyday small talk that occurs regularly in many workplaces. For workers with ASD, however, such apparently "easy" interactions may present a real challenge. In this example, while Alex responds to direct questions, the responses are brief and lack elaboration. Alex's responses provide minimal information rather than actively participating in the flow of the conversation. Additionally, Alex's lack of response to the last prompt may suggest difficulty in extending or sustaining the dialogue.
Although workers with ASD are often highly trained and skilled in job-specific tasks, they frequently face challenges with social interactions in the workplace.
Situated Haptic Interaction: Exploring the Role of Context in Affective Perception of Robotic Touch
Affective interaction is not merely about recognizing emotions; it is an embodied, situated process shaped by context and co-created through interaction. In affective computing, the role of haptic feedback within dynamic emotional exchanges remains underexplored. This study investigates how situational emotional cues influence the perception and interpretation of haptic signals given by a robot. In a controlled experiment, 32 participants watched video scenarios in which a robot experienced either positive actions (such as being kissed), negative actions (such as being slapped) or neutral actions. After each video, the robot conveyed its emotional response through haptic communication, delivered via a wearable vibration sleeve worn by the participant. Participants rated the robot's emotional state, namely its valence (positive or negative) and arousal (intensity), based on the video, the haptic feedback, and the combination of the two. The study reveals a dynamic interplay between visual context and touch. Participants' interpretation of haptic feedback was strongly shaped by the emotional context of the video, with visual context often overriding the perceived valence of the haptic signal. Negative haptic cues amplified the perceived valence of the interaction, while positive cues softened it. Furthermore, the haptic signal overrode participants' perception of the arousal conveyed by the video. Together, these results offer insights into how situated haptic feedback can enrich affective human-robot interaction, pointing toward more nuanced and embodied approaches to emotional communication with machines.
Gaze Behavior During a Long-Term, In-Home, Social Robot Intervention for Children with ASD
Ramnauth, Rebecca, Shic, Frederick, Scassellati, Brian
Atypical gaze behavior is a diagnostic hallmark of Autism Spectrum Disorder (ASD), playing a substantial role in the social and communicative challenges that individuals with ASD face. This study explores the impacts of a month-long, in-home intervention designed to promote triadic interactions between a social robot, a child with ASD, and their caregiver. Our results indicate that the intervention successfully promoted appropriate gaze behavior, encouraging children with ASD to follow the robot's gaze, resulting in more frequent and prolonged instances of spontaneous eye contact and joint attention with their caregivers. Additionally, we observed specific timelines for behavioral variability and novelty effects among users. Furthermore, diagnostic measures for ASD emerged as strong predictors of gaze patterns for both caregivers and children. These results deepen our understanding of ASD gaze patterns and highlight the potential for clinical relevance of robot-assisted interventions.
Multimodal Fusion with LLMs for Engagement Prediction in Natural Conversation
Ma, Cheng Charles, Joo, Kevin Hyekang, Vail, Alexandria K., Bhattacharya, Sunreeta, García, Álvaro Fernández, Baker-Matsuoka, Kailana, Mathew, Sheryl, Holt, Lori L., De la Torre, Fernando
Over the past decade, wearable computing devices ("smart glasses") have undergone remarkable advancements in sensor technology, design, and processing power, ushering in a new era of opportunity for high-density human behavior data. Equipped with wearable cameras, these glasses offer a unique opportunity to analyze non-verbal behavior in natural settings as individuals interact. Our focus lies in predicting engagement in dyadic interactions by scrutinizing verbal and non-verbal cues, aiming to detect signs of disinterest or confusion. Leveraging such analyses may revolutionize our understanding of human communication, foster more effective collaboration in professional environments, provide better mental health support through empathetic virtual interactions, and enhance accessibility for those with communication barriers. In this work, we collect a dataset featuring 34 participants engaged in casual dyadic conversations, each providing self-reported engagement ratings at the end of each conversation. We introduce a novel fusion strategy using Large Language Models (LLMs) to integrate multiple behavior modalities into a "multimodal transcript" that can be processed by an LLM for behavioral reasoning tasks. Remarkably, this method achieves performance comparable to established fusion techniques even in its preliminary implementation, indicating strong potential for further research and optimization. This fusion method is one of the first to approach "reasoning" about real-world human behavior through a language model. Smart glasses provide us the ability to unobtrusively gather high-density multimodal data on human behavior, paving the way for new approaches to understanding and improving human communication with the potential for important societal benefits. The features and data collected during the studies will be made publicly available to promote further research.
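The abstract above does not say how the "multimodal transcript" is built, so the following is only a minimal sketch of one plausible reading: time-stamped events from several behavior channels are interleaved into a single text that a language model could read. The `Event` type, the modality labels ("speech", "gaze", "face"), and the line format are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One time-stamped observation (hypothetical schema, not from the paper)."""
    start: float    # seconds from the start of the conversation
    speaker: str    # participant label, e.g. "A" or "B"
    modality: str   # assumed labels: "speech", "gaze", "face"
    content: str    # spoken words or a short behavior description


def build_multimodal_transcript(events):
    """Interleave events from all modalities into one time-ordered text."""
    lines = []
    # Sort by time; ties are broken by modality name so output is deterministic.
    for ev in sorted(events, key=lambda e: (e.start, e.modality)):
        if ev.modality == "speech":
            lines.append(f'[{ev.start:05.1f}s] {ev.speaker}: "{ev.content}"')
        else:
            lines.append(
                f"[{ev.start:05.1f}s] {ev.speaker} ({ev.modality}): {ev.content}"
            )
    return "\n".join(lines)


# Toy input: events arrive grouped by sensor, not by time.
events = [
    Event(0.0, "A", "speech", "How was your weekend?"),
    Event(1.2, "B", "gaze", "looks away"),
    Event(0.4, "B", "face", "smiles"),
    Event(2.0, "B", "speech", "Pretty quiet, honestly."),
]
transcript = build_multimodal_transcript(events)
print(transcript)
```

A transcript like this could then be placed in a prompt that asks for an engagement rating. How the authors actually phrase, prompt, or score it is not stated in the abstract.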
Creativity and Visual Communication from Machine to Musician: Sharing a Score through a Robotic Camera
Greer, Ross, Fleig, Laura, Dubnov, Shlomo
This paper explores the integration of visual communication and musical interaction by implementing a robotic camera within a "Guided Harmony" musical game. We aim to examine co-creative behaviors between human musicians and robotic systems. Our research explores existing methodologies like improvisational game pieces and extends these concepts to include robotic participation using a PTZ camera. The robotic system interprets and responds to nonverbal cues from musicians, creating a collaborative and adaptive musical experience. This initial case study underscores the importance of intuitive visual communication channels. We also propose future research directions, including parameters for refining the visual cue toolkit and data collection methods to understand human-machine co-creativity further. Our findings contribute to the broader understanding of machine intelligence in augmenting human creativity, particularly in musical settings.
Desperate parents turn to magnetic therapy to help kids with autism. They have little evidence to go on
Thomas VanCott compares his son Jake's experience with autism to life on a tightrope. Upset the delicate balance and Jake, 18, plunges into frustration, slapping himself and twisting his neck in seemingly painful ways. Like many families with children on the autism spectrum, Jake's parents sought treatments beyond traditional speech and behavioral therapies. One that seemed promising was magnetic e-resonance therapy, or MERT, a magnetic brain stimulation therapy trademarked in 2016 by a Newport Beach-based company called Wave Neuroscience. The company licensed MERT to private clinics across the country that offered it as a therapy for conditions including depression, PTSD and autism. Those clinics described MERT as a noninvasive innovation that could improve an autistic child's sleep, social skills and -- most attractive to the VanCott family -- speech. It was expensive -- $9,000 -- and not covered by insurance.
Video-based Analysis Reveals Atypical Social Gaze in People with Autism Spectrum Disorder
Yu, Xiangxu, Ruan, Mindi, Hu, Chuanbo, Li, Wenqi, Paul, Lynn K., Li, Xin, Wang, Shuo
In this study, we present a quantitative and comprehensive analysis of social gaze in people with autism spectrum disorder (ASD). Diverging from traditional first-person camera perspectives based on eye-tracking technologies, this study utilizes a third-person perspective database from the Autism Diagnostic Observation Schedule, 2nd Edition (ADOS-2) interview videos, encompassing ASD participants and neurotypical individuals as a reference group. Employing computational models, we extracted and processed gaze-related features from the videos of both participants and examiners. The experimental samples were divided into three groups based on the presence of social gaze abnormalities and ASD diagnosis. This study quantitatively analyzed four gaze features: gaze engagement, gaze variance, gaze density map, and gaze diversion frequency. Furthermore, we developed a classifier trained on these features to identify gaze abnormalities in ASD participants. Together, we demonstrated the effectiveness of analyzing social gaze in people with ASD in naturalistic settings, showcasing the potential of third-person video perspectives in enhancing ASD diagnosis through gaze analysis.
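The abstract names four gaze features but does not define them. As a rough illustration only, the sketch below computes three of them from per-frame data under assumed definitions: engagement as the fraction of frames in which the participant looks at the examiner, variance as the population variance of a horizontal gaze angle, and diversion frequency as the number of look-away onsets per second. The function name, its inputs, and these definitions are hypothetical. The gaze density map is omitted.

```python
from statistics import pvariance


def gaze_features(on_partner, yaw_deg, fps):
    """Compute toy gaze features from per-frame data (assumed definitions).

    on_partner: list of bools, True when gaze is on the conversation partner
    yaw_deg:    list of floats, horizontal gaze angle per frame, in degrees
    fps:        video frame rate, in frames per second
    """
    n = len(on_partner)
    if n == 0 or n != len(yaw_deg):
        raise ValueError("inputs must be non-empty and of equal length")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Fraction of frames spent looking at the partner.
    engagement = sum(on_partner) / n
    # Spread of the horizontal gaze angle across the clip.
    variance = pvariance(yaw_deg)
    # A diversion is counted at each on-partner -> off-partner transition.
    diversions = sum(
        1 for prev, cur in zip(on_partner, on_partner[1:]) if prev and not cur
    )
    duration_s = n / fps
    return {
        "engagement": engagement,
        "variance": variance,
        "diversion_freq_hz": diversions / duration_s,
    }


# Toy clip: 8 frames at 4 frames per second (2 seconds).
on_partner = [True, True, False, False, True, False, True, True]
yaw = [0.0, 2.0, 20.0, 22.0, 1.0, 18.0, 0.0, 1.0]
feats = gaze_features(on_partner, yaw, fps=4.0)
print(feats)
```

Feature vectors of this kind could feed a standard classifier, which is roughly the role the abstract describes. The actual features, models, and thresholds used in the study are not given here.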
Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition
Deng, Shijian, Kosloski, Erin E., Patel, Siddhi, Barnett, Zeke A., Nan, Yiyang, Kaplan, Alexander, Aarukapalli, Sisira, Doan, William T., Wang, Matthew, Singh, Harsh, Rollins, Pamela R., Tian, Yapeng
In this article, we introduce a novel problem of audio-visual autism behavior recognition, which includes social behavior recognition, an essential aspect previously omitted in AI-assisted autism screening research. We define the task as using audio and visual cues, including any speech present in the audio, to recognize autism-related behaviors. To facilitate this new research direction, we collected an audio-visual autism spectrum dataset (AV-ASD), currently the largest video dataset for autism screening using a behavioral approach. It covers an extensive range of autism-associated behaviors, including those related to social communication and interaction. To pave the way for further research on this new problem, we intensively explored leveraging foundation models and multimodal large language models across different modalities. Our experiments on the AV-ASD dataset demonstrate that integrating audio, visual, and speech modalities significantly enhances the performance in autism behavior recognition. Additionally, we explored the use of a post-hoc to ad-hoc pipeline in a multimodal large language model to investigate its potential to augment the model's explanatory capability during autism behavior recognition. We will release our dataset, code, and pre-trained models.
Revealed: Why your brain waves dictate whether you click with someone - or can't stand them!
Does the secret to happy family bonds, lasting friendships, romantic bliss, academic and work success lie in getting our brain waves into sync with those of the people around us? That's the intriguing idea being raised by a wealth of research that investigates how our brain-wave activity can get into the same patterns (or in sync) with the brain waves of people we feel compatible with. Brain waves are electrical patterns that measure only millionths of a volt. There are five widely recognised ones -- alpha, beta, gamma, delta and theta -- and these are believed to regulate how we think and act. They can be detected by EEG (electroencephalogram, which analyses electrical activity in the brain) read-outs as our brains go about their everyday functions. For example, beta waves are thought to occur during most of our conscious, waking states, while alpha waves occur when we feel relaxed and thoughtful.